Greg Detre
Wednesday, October 16, 2002
Sorry to have been dragging my feet over the project proposal. Here's a preliminary outline of what I hope might be interesting and feasible to try. I would welcome any criticisms or guidance.
I started by reading some of Luc Steels' papers, but they've been a bit disappointing. I like the questions he's addressing, and his methodological framework (Steels, 1997, 'Synthesising the origins of language and meaning using co-evolution, self-organisation and level formation', http://citeseer.nj.nec.com/steels97synthesising.html), but I suppose I'm disappointed by how much he has to build in to his models in order for the self-organisation to work. For instance, there's a paper in which he tries to self-organise a spatial vocabulary within a language community (Steels 1995, 'A self-organising spatial vocabulary', http://arti.vub.ac.be/steels/space.ps), which works, but only within the crippling constraint of pre-specified meaning sets that the vocabulary maps onto. The only self-organisation that is happening is the mapping of the arbitrary vocabulary symbols onto the pre-specified meaning set (of extremely basic spatial terms like 'front', 'left' etc.), simply by correlating referential 'gestures' with language-symbol 'utterances'.
I'm still very keen to investigate the space-time representation crossover that we discussed with regard to Boroditsky's and Regier's work, but that has had to take a back-burner while I've been caught up with other things. I realise that you'd like to keep that project and the MAS962 project fairly separate, but since I have been thinking about space/time representations a lot, I was hoping you wouldn't mind if I considered that as the domain for the MAS962 project.
Although the Regier & Carlson 'above' model is elegant and rigorous, I feel that it might actually be more complicated than necessary for the sort of hypotheses I'd want to consider first. I thought that it might actually be possible to make progress using a very simple model, similar to the one described in Steels (1995) that I mentioned above. The world would consist of a 1D or 2D spatial grid, discrete timesteps, two agents and a series of objects whose changing location they have to describe to each other.
Steels uses the prepositions 'front', 'left', 'right' and 'back', but since I'm hoping to shift into the temporal domain eventually, I considered that 'near' might be appropriate, simple and fundamental. I'm imagining 'nearness' being defined as just the Euclidean distance from an agent to the object. There would be some threshold radius, within which an object would be considered 'near'. It may also be necessary to define 'far' as the converse of 'near'.
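As a concrete sketch, the 'near'/'far' predicates could be as simple as the following (the radius value and the function names are my own placeholder assumptions, not settled parameters):

```python
import math

NEAR_RADIUS = 5.0  # arbitrary threshold radius; a parameter to tune

def is_near(agent_pos, object_pos, radius=NEAR_RADIUS):
    """True if the object falls within the agent's 'near' radius.

    Positions are tuples of coordinates, so the same predicate works
    unchanged in the 1D and 2D grid worlds described above.
    """
    return math.dist(agent_pos, object_pos) <= radius

def is_far(agent_pos, object_pos, radius=NEAR_RADIUS):
    # 'far' defined simply as the converse of 'near'
    return not is_near(agent_pos, object_pos, radius)
```

Taking positions as coordinate tuples means nothing needs rewriting when the world moves from one dimension to two.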
I haven't read that paper on the semantics of 'in' yet, but I mean to. I'm hoping that 'near' will prove a less problematic notion than you said 'in' is.
It might make sense to start with a one-dimensional world (visualise a bridge) with the objects moving towards the agents from one side. They take it in turns to be able to see the objects coming towards them, and to shout a 'near' warning to the other agent (one agent holds their shared eyeball, while the other operates their shared pogo stick). When an object becomes 'near', they have to 'jump' for the interaction to be a success. Having learned 'near', we might eventually have the objects come from both sides, and use 'front' and 'back'.
Things get interesting when the speed at which the objects approach can vary. At this point, the time-to-impact is no longer simply a function of distance, but also of speed. This would allow us either to redefine 'near' to be 'soon' (i.e. a function that combines either far-and-fast, or close-and-slow, or close-and-fast), or to split it into two notions of (physically) 'close' and 'fast'.
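A minimal sketch of the 'soon' reading, with the threshold value an arbitrary assumption: treating 'soon' as thresholded time-to-impact is exactly what folds far-and-fast and close-and-slow into a single notion.

```python
SOON_THRESHOLD = 3.0  # timesteps; an arbitrary placeholder value

def time_to_impact(distance, speed):
    """Timesteps until the object reaches the agent, assuming constant speed."""
    if speed <= 0:
        return float('inf')  # a stationary or receding object never arrives
    return distance / speed

def is_soon(distance, speed, threshold=SOON_THRESHOLD):
    # far-and-fast (e.g. 9 units at speed 3) and close-and-slow
    # (e.g. 2 units at speed 1) can both come out 'soon'
    return time_to_impact(distance, speed) <= threshold
```

The alternative, splitting into 'close' and 'fast', would just be two separate thresholds on distance and on speed.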
I've described things in this careful, limited-seeming way because this seems like the most intuitive way to build a spatial representation that could be co-opted into the temporal domain. I'm sure you have a clearer idea than I do of how the Regier model might be co-opted into the temporal domain, but I had a really hard time conceiving how it might work at all, at least not without adding enormous extra complication. As it is, I think it may be necessary to move to a 2D spatial world, but I thought it might make things simpler initially if the spatial and temporal representations were of the same dimensionality. I'm also initially only thinking in terms of the agents and objects as points.
I also think it may prove necessary to allow the agents freedom of movement in at least one dimension in order to see anything interesting in the temporal domain, especially with regard to perhaps teasing apart ego- vs time-moving representations. At the moment, this is little more than a vague hunch, but it seems difficult to imagine how an ego-moving representation of time would ever emerge without some notion of the ego being able to move. This may have something to do with the fact that a distinction between egocentric and allocentric coordinates only exists when one can move within the world. I presume it will then also be necessary for the agents to be able to move independently of each other, and to represent each other's position relative to the object (presumably in allocentric coordinates). I suspect that would be complicated, which is why I don't want to introduce it into the model until later.
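For reference, the allocentric-to-egocentric conversion I have in mind is just a translation plus a rotation; a sketch in 2D (the heading convention is my own assumption):

```python
import math

def to_egocentric(agent_pos, agent_heading, point):
    """Convert an allocentric (world-frame) point into the agent's frame.

    Translate so the agent sits at the origin, then rotate so the
    agent's facing direction (heading, in radians anticlockwise from
    the world x-axis) becomes the egocentric +x ('front') axis.
    """
    dx = point[0] - agent_pos[0]
    dy = point[1] - agent_pos[1]
    cos_h, sin_h = math.cos(agent_heading), math.sin(agent_heading)
    return (cos_h * dx + sin_h * dy, -sin_h * dx + cos_h * dy)
```

The relevant point is that this mapping only has any content once an agent has a position and heading of its own to transform by; for an immobile agent the two frames never come apart, which is the hunch above in miniature.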
I'm pretty keen for the representations to be connectionist, at least for the most part. This is partly because I worry that any predominantly symbolic representation ends up being spoon-fed its structure by the designer, though I'm aware that much the same could be argued of connectionist representations too. My preference for connectionism is also motivated by an attempt to get away from Steels' use of pre-specified meaning sets. His emphasis is on attaching the vocabulary to these meaning sets by means of a self-organising population-wide convergence. I see this as analogous to the way that bees' decision to move hive results, very suddenly, in a population-wide convergence; this also seems to be the same sort of basic self-organisation that Nowak et al. have in mind in their model of evolutionary population dynamics (2000, http://www.ptb.ias.edu/plotkin/EvolutionOfSyntax.pdf). One means of achieving this self-organising of the meaning sets might be to use a reinforcement learning algorithm, rewarding the agents when they successfully communicate and avoid a near object, as a means of conveying the meaning of 'near'. The opposite approach would be to use a backpropagation net (say) with an explicit teacher/target stimulus to convey the sense of each preposition. I can't see any obvious advantage to this over a symbolic approach, but it may prove the most tractable way to start.
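To make the reinforcement idea concrete, here is a minimal sketch of a two-agent, Steels-style naming game with score-based reinforcement; the vocabulary symbols, reward values and update rule are all placeholder assumptions of mine, and the meanings are still pre-specified, so it illustrates only the convergence mechanism, not the grounding:

```python
import random

WORDS = ['ba', 'du', 'ki']       # arbitrary vocabulary symbols
MEANINGS = ['near', 'far']       # meanings to be agreed upon (still pre-specified)

class Agent:
    def __init__(self):
        # association score between every meaning and every word
        self.scores = {(m, w): 0.0 for m in MEANINGS for w in WORDS}

    def speak(self, meaning):
        # utter the highest-scoring word for this meaning (random tie-break)
        best = max(self.scores[(meaning, w)] for w in WORDS)
        return random.choice(
            [w for w in WORDS if self.scores[(meaning, w)] == best])

    def interpret(self, word):
        # read the word as its highest-scoring meaning (random tie-break)
        best = max(self.scores[(m, word)] for m in MEANINGS)
        return random.choice(
            [m for m in MEANINGS if self.scores[(m, word)] == best])

def play_round(speaker, hearer):
    meaning = random.choice(MEANINGS)
    word = speaker.speak(meaning)
    guess = hearer.interpret(word)
    if guess == meaning:  # successful communication: reinforce the pairing
        speaker.scores[(meaning, word)] += 1.0
        hearer.scores[(meaning, word)] += 1.0
        return True
    # failure: each agent punishes the association it actually used
    speaker.scores[(meaning, word)] -= 0.2
    hearer.scores[(guess, word)] -= 0.2
    return False

a, b = Agent(), Agent()
for _ in range(1000):
    play_round(a, b)
```

After a few hundred rounds the pair settle on distinct words for 'near' and 'far' without the lexicon being imposed by the designer; self-organising the meanings themselves out of the perceptual task, rather than listing them in advance, is the harder step this model is aiming at.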
Just to defend some of the choices I've made philosophically, the assumptions and contentions that I've tried to set out for myself are as follows. I'm afraid they're pretty vague at the moment.
I'm interested in how the adaptive value of language shapes its structure. The notion of 'adaptiveness' implies an evolutionary framework, so maybe it's a misleading term to use. By it, I'm thinking of any approach where the learning algorithm is in some way rewarded/reinforced, with this derived purpose anchoring meaning. Eventually, I'm keen to explore the idea of language games as a means of exploring what sort of selective pressures (i.e. different tasks) might have been operating to produce human language.
(Much of this is drawn from Steels, amongst others).
Language is emergent/distributed
There's no centralised controller or single language community leader, and you can't have a single speaker of a language. When we talk about 'Language', we're referring to the past and possible future interactions between its speakers. I'm hoping that two speakers will be enough for the dynamics to work.
It 'spontaneously forms itself once the appropriate physiological, psychological and social conditions are satisfied' (see Steels, 1997).
Language is too complex to arise out of a generic learning algorithm or cognitive mechanism. There must be some degree of specialisation or hard-wired bias to narrow the search space enough to make learning the language tractable for new speakers. In this model, most of the 'hard-wired bias' will probably be in the form of my tweaking of the agents' cognitive architecture.
Language is the tip of the cognitive iceberg
You can't speak a language without a rich set of separable but densely inter-related concepts that you share to a greater or lesser degree with fellow speakers, and which are grounded in a shared environment and perceptuomotor mechanisms. Language and conceptual structures are interwoven: they influence each other, but you can't talk in terms of one (wholly) determining or preceding the other.
A language community is shaped by the experiences, limitations and nature of embodiment of its individual constituents. It follows then that you can't have even a rudimentary language without also having a functional, if basic, set of concepts and behaviours.
Language is combinatorial
We can, of course, imagine a system of communication which does not involve combining units. For instance, we can imagine some lookup table of arbitrary symbols, each of which stands for some important 'sentence', and it might be pretty adaptive for its speakers, especially if they had large memories and a basic or stable environment.
However, our world is not stable, and even if it were, it's too complex for a lookup table to express usefully without being enormous. As a result, we employ syntax as a means of describing a potentially infinite number of possible combinations between concepts.
So although there are certainly non-combinatorial proto-languages, I can't see how there could be an interesting language that didn't involve combination, or some analogous form of syntax. Put in other terms, without generative rules that can be applied an arbitrary number of times, the representative/expressive capacity of a language will always be finite and impoverished.
At this stage, the questions being investigated in this model are preliminary to the question of syntax, but it may be possible to build up more complex messages such as 'close fast front'.
Whether communication is the main selective pressure behind language, or whether there is some alternative internal benefit to being a language-user.
In this model, communication is the major 'selective pressure' for these agents. At this stage, I don't see how it would be possible to look at the extent to which the proto-language has shaped the concepts it expresses.
I'd really like to consider how questions, statements and imperatives arise and combine to form discourse, but I don't think it will really be an issue I'll be able to address here.
At the very least, it would be enormously satisfying and instructive to build a basic one-dimensional world model, even with a language community of two agents, who can self-organise the prepositions 'near' (maybe 'far'), 'front' and 'back'. This wouldn't be breaking any particularly new ground, although I haven't yet come across a paper that looks at the preposition 'near'.
I feel that the project could take two directions at this point.
I think that investigating the temporal effects would be more likely to work, and might be potentially more interesting. However, I suspect that the most interesting effects, like the ego- vs time-moving representations, would require a pretty advanced cognitive architecture to be seen. Indeed, they must be contingent to some degree on the particular way in which the human brain represents space, time and movement, and I can't yet see how to strip away the extraneous factors to model them. If we were investigating time in a two-dimensional world, then it might eventually also be possible to consider horizontal vs vertical linguistic conceptions of time as well.